ESTIMATION OF PARAMETERS IN SYSTEMS OF RELATIVELY SLOW TIME VARIATION*


Philip H. Fiske
William  L.  Root 

The  University  of  Michigan,  Ann  Arbor 


ABSTRACT 

The estimation of parameters for a rather general class of nonlinear,
time-varying, causal, bounded memory systems is discussed. A model for such
systems is established using the concept that the system variation is
unknown, but bounded between observation times. A recursive estimation
algorithm is developed for the model, and both upper and lower bounds are
found for the guaranteed error of estimation.


DEVELOPMENT  OF  THE  SYSTEM  MODEL 

In this paper the term system means simply a mapping from an appropriate
space of inputs to an appropriate space of outputs. The central purpose of
the paper is to obtain estimates on how rapidly causal systems, of a certain
rather general class that includes nonlinear as well as linear systems, can
vary with time and still allow adequate experimental identification.

A particular kind of system model is especially appropriate for this study
and is used here. It has the features that it is linear in the parameters,
even though the system itself may be nonlinear, and that it is widely
applicable if the only noise to be considered is output observation noise.
We briefly and heuristically describe such a model in a special case in what
follows, and this is sufficient for this paper. However, there is a general
theory of such models, and most of the ad hoc assumptions made below can
either be shown to be justifiable or not needed (see [1]).

Let $\mathcal{X}$ be an arbitrary input space, $\mathcal{Y}$ a linear output
space, and $\mathcal{H}$ a set of mappings from $\mathcal{X}$ into
$\mathcal{Y}$ (i.e., $\mathcal{H}$ is a class of "systems"), so that one may
write

$$y = h(x), \quad x \in \mathcal{X},\ y \in \mathcal{Y},\ h \in \mathcal{H}. \tag{1}$$

Each $x \in \mathcal{X}$ determines a function from $\mathcal{H}$ into
$\mathcal{Y}$. If this function is denoted by $X$, then

$$y = X(h), \quad h \in \mathcal{H}. \tag{2}$$

Let addition and scalar multiplication be defined in $\mathcal{H}$ in the way
they usually are for functions and then extend $\mathcal{H}$, if necessary, so
that it is closed under linear combinations. Denote this linear extension by
$\tilde{\mathcal{H}}$. It is now trivial to verify that the mapping $X$ as
extended to $\tilde{\mathcal{H}}$ will be linear, even though the mapping $h$
may be nonlinear (see, e.g., [1]).

*Research sponsored by the United States Air Force, Air Force Office of
Scientific Research Grant No. 72-2J28-B.

If additive output noise is present, (1) and (2) are replaced by

$$y = h(x) + v \tag{3}$$

and

$$y = Xh + v, \quad h \in \tilde{\mathcal{H}}, \tag{4}$$

where $v$ is a random variable taking values in $\mathcal{Y}$ and $X$ is
linear. Thus the problem of identifying the system has been put in the form of
the classical problem of estimating $h$ in the linear model (4). Of course,
equation (4) is abstract, so we now need to specialize it so as to have a
meaningful estimation problem. Here we simply assume what we need in order to
do this easily.
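The linearity claimed for the mapping $X$ can be illustrated on a small case. The sketch below assumes a hypothetical memoryless cubic system (the polynomial form, the names `system_output`, `input_functional`, `apply_functional`, and all numbers are illustrative choices, not taken from the paper). The system is nonlinear in the input, yet for a fixed input the map from the system to its output is linear in the coefficient vector.

```python
# Hypothetical example: a memoryless system y = c1*x + c2*x**2 + c3*x**3.
# h is nonlinear in x, but for a fixed input x the map h -> h(x) is linear
# in the coefficient vector (c1, c2, c3) that represents h.

def system_output(coeffs, x):
    """Evaluate h(x) for the system with the given polynomial coefficients."""
    return sum(c * x ** (j + 1) for j, c in enumerate(coeffs))

def input_functional(x, order=3):
    """Row vector representing the linear map X: h -> h(x)."""
    return [x ** (j + 1) for j in range(order)]

def apply_functional(row, coeffs):
    """Compute X(h) as an inner product of the row with the parameters."""
    return sum(r * c for r, c in zip(row, coeffs))

h_a = [1.0, -0.5, 0.25]
h_b = [0.3, 2.0, -1.0]
x = 1.7
row = input_functional(x)

# A linear combination of two systems, taken coefficient by coefficient.
combo = [2.0 * a + 3.0 * b for a, b in zip(h_a, h_b)]
lhs = apply_functional(row, combo)
rhs = 2.0 * apply_functional(row, h_a) + 3.0 * apply_functional(row, h_b)
```

Here `lhs` and `rhs` agree up to rounding, which is the statement that $X$ is linear on the extended class even though each $h$ is a nonlinear function of $x$.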

Suppose that a system with vector-valued inputs and outputs can be adequately
represented in terms of sampled inputs and outputs. Furthermore, assume that
it has bounded memory (this is the only critical assumption). Then an input
is a vector sequence $\{x_k\}$, the corresponding output is a vector sequence
$\{y_k\}$, and if the system has memory $m$,

$$y_k = h_k(x_k, x_{k-1}, \ldots, x_{k-m}) + v_k, \quad k = 1, 2, \ldots,$$

or

$$y_k = h_k(\underline{x}_k) + v_k, \quad k = 1, 2, \ldots, \tag{5}$$

where

$$\underline{x}_k \triangleq (x_k, x_{k-1}, \ldots, x_{k-m}).$$

If the system is time-invariant each $h_k$ is the same; otherwise, of course,
the $h_k$'s are different. To each $\underline{x}_k$ there corresponds a
linear mapping $X_k$ defined, as before, by
$X_k h_k \triangleq h_k(\underline{x}_k)$, so that

$$y_k = X_k h_k + v_k. \tag{6}$$


Let there be imposed the further condition that all the mappings $h_k$ in the
linearly extended class of systems $\tilde{\mathcal{H}}$ are fully
characterized by a finite set of parameters, say $p$ in number. The $\{h_k\}$
can then be represented by $p$-vectors, and if the output is an $r$-vector,
the $\{X_k\}$ can be represented by $r \times p$ matrices.

If the systems in $\tilde{\mathcal{H}}$ are of uniformly bounded time
variation for each finite interval, one can write

$$h_{k+1} = h_k + w_k, \quad k = 1, 2, \ldots, \tag{7}$$

where each component of $w_k$ is bounded. The system model then becomes

$$\begin{aligned} h_{k+1} &= h_k + w_k \\ y_k &= X_k h_k + v_k, \end{aligned} \quad k = 1, 2, \ldots, \tag{8}$$

where the $\{w_k\}$ are bounded, unknown elements of $\tilde{\mathcal{H}}$ and
the $\{v_k\}$ are random variables representing output noise. The equations
(8) are, of course, in the form of the system equations for a degenerate
linear dynamical system, but the interpretation given to them is entirely
different.
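A minimal simulation sketch of the model (8) for a single scalar parameter follows. The uniform draw of the drift and the Gaussian output noise are assumptions made only to produce numbers; the model itself requires nothing of $w_k$ beyond boundedness. The function name `simulate_slow_system` and its arguments are hypothetical.

```python
import random

def simulate_slow_system(h1, eta, noise_std, inputs, seed=0):
    """Simulate h_{k+1} = h_k + w_k, y_k = x_k * h_k + v_k (scalar case).

    w_k is drawn uniformly from [-eta, eta] purely for illustration; the
    model only requires |w_k| <= eta and assumes no distribution for w_k.
    """
    rng = random.Random(seed)
    h = h1
    params, outputs = [], []
    for x in inputs:
        params.append(h)
        outputs.append(x * h + rng.gauss(0.0, noise_std))  # output noise v_k
        h = h + rng.uniform(-eta, eta)                      # bounded drift w_k
    return params, outputs

params, outputs = simulate_slow_system(h1=2.0, eta=0.05, noise_std=0.1,
                                       inputs=[1.0] * 50)
# Largest change of the parameter between consecutive observations.
max_step = max(abs(b - a) for a, b in zip(params, params[1:]))
```

By construction `max_step` never exceeds `eta`, while the parameter is free to wander by up to `eta` per step in either direction.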


We wish to consider successive blocks of measurements where each block
consists of tests with the same suitably chosen $N$ inputs
$\{x_1, \ldots, x_N\}$. We make the important simplifying assumption that the
system parameters change only between blocks of measurements. Results can be
obtained without this assumption (see [3]), but there is not space to go into
the additional argument necessary here. This assumption gives an acceptable
approximation anyway in many cases where there is only slight variation within
the blocks.


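With this assumption we have

$$\begin{aligned} h^{k+1} &= h^k + w^k \\ y^k &= X h^k + v^k, \end{aligned} \quad k = 1, 2, \ldots, \tag{9}$$

where $h^k \triangleq h_{kN}$ (the common value of the parameter vector
throughout the $k$-th block) and

$$y^k = \begin{bmatrix} y_{(k-1)N+1} \\ y_{(k-1)N+2} \\ \vdots \\ y_{kN} \end{bmatrix}, \quad
X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix}, \quad
v^k = \begin{bmatrix} v_{(k-1)N+1} \\ v_{(k-1)N+2} \\ \vdots \\ v_{kN} \end{bmatrix}.$$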


The condition to be imposed on the inputs $\{x_1, \ldots, x_N\}$ is that the
operator $X$ has zero null space, so that the vector $h^k$ is estimable. The
usual assumptions are made about the noise,

$$Ev^k = 0, \quad Ev^k (v^j)^T = R\,\delta_{jk}, \tag{10a}$$

where $R$ is strictly positive definite. The boundedness condition on the
$w^k$ is taken to be in the form

$$|(w^k, \psi_i)| \le \eta_i, \quad i = 1, \ldots, p, \tag{10b}$$

for all $k = 1, 2, \ldots$, where $\{\psi_i\}$ is a complete orthonormal basis
of eigenvectors of $(X^T R^{-1} X)^{-1}$, i.e.,

$$(X^T R^{-1} X)^{-1} \psi_i = \sigma_i^2 \psi_i, \quad i = 1, 2, \ldots, p.$$

The eigenvalues $\{\sigma_i^2\}$ are real and nonnegative.
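The estimability condition and the eigenvector basis can be checked numerically on a small case. The sketch below is restricted, as an assumption for brevity, to $p = 2$ parameters and a diagonal noise covariance; the helper names `gram_inverse` and `symmetric_eig_2x2` and the test matrix are hypothetical.

```python
import math

def gram_inverse(X, r_diag):
    """Return (X^T R^{-1} X)^{-1} for p = 2 columns and diagonal R."""
    g11 = sum(row[0] * row[0] / r for row, r in zip(X, r_diag))
    g12 = sum(row[0] * row[1] / r for row, r in zip(X, r_diag))
    g22 = sum(row[1] * row[1] / r for row, r in zip(X, r_diag))
    det = g11 * g22 - g12 * g12
    if abs(det) < 1e-12:
        # A singular Gram matrix means X has a nontrivial null space.
        raise ValueError("X has a nontrivial null space; h is not estimable")
    return [[g22 / det, -g12 / det], [-g12 / det, g11 / det]]

def symmetric_eig_2x2(M):
    """Eigenvalues and orthonormal eigenvectors of a symmetric 2x2 matrix."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    if abs(b) < 1e-15:                       # already diagonal
        return [a, d], [(1.0, 0.0), (0.0, 1.0)]
    mean, half = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    values = [mean + half, mean - half]
    vectors = []
    for lam in values:
        v = (b, lam - a)                     # solves (M - lam*I) v = 0 when b != 0
        norm = math.hypot(v[0], v[1])
        vectors.append((v[0] / norm, v[1] / norm))
    return values, vectors

X = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]     # N = 3 scalar tests, p = 2
M = gram_inverse(X, r_diag=[1.0, 1.0, 1.0])
sigma_sq, psi = symmetric_eig_2x2(M)          # the sigma_i^2 and psi_i
```

For this choice of inputs the eigenvalues come out as 1 and 1/3, and the two eigenvectors are orthogonal unit vectors, as the definition of $\{\psi_i\}$ requires.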


ESTIMATION OF PARAMETERS


We now derive a recursive estimation algorithm for the parameter vector $h^k$
and obtain a uniform upper bound on the mean-squared error. Since the actual
mean-squared error depends on the $\{w^k\}$, which are unknown, an exact
expression for mean-squared error cannot be given, of course.

We begin by forming the standard linear, unbiased, minimum variance (LUMV)
estimate of $h^1$,

$$\hat{h}^1 = Cy^1, \tag{11}$$

where $C \triangleq (X^T R^{-1} X)^{-1} X^T R^{-1}$. The error in this
estimate satisfies

$$E\|\hat{h}^1 - h^1\|^2 = \sum_{i=1}^{p} \sigma_i^2 = \sum_{i=1}^{p} b_{1i}^2, \tag{12}$$

where $b_{1i}^2 \triangleq \sigma_i^2$, $i = 1, \ldots, p$. Now consider the
second observation

$$y^2 = Xh^2 + v^2 = Xh^1 + Xw^1 + v^2,$$

which can be rewritten as

$$y^2 - X\hat{h}^1 = X[(h^1 - \hat{h}^1) + w^1] + v^2. \tag{13}$$

Define $z^2 \triangleq y^2 - X\hat{h}^1$ and
$\xi^1 \triangleq (h^1 - \hat{h}^1) + w^1 = h^2 - \hat{h}^1$, so that (13)
becomes

$$z^2 = X\xi^1 + v^2. \tag{14}$$
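The LUMV gain and the residual used above can be checked on a one-parameter case. The sketch assumes $p = 1$ and a diagonal noise covariance so that no matrix library is needed; `lumv_gain` and the sample numbers are hypothetical.

```python
def lumv_gain(x_col, r_diag):
    """Return (C, sigma_sq) with C = (X^T R^{-1} X)^{-1} X^T R^{-1} for p = 1.

    x_col holds the N entries of the single column of X, r_diag the
    diagonal of R. sigma_sq = (X^T R^{-1} X)^{-1} is the error variance.
    """
    info = sum(x * x / r for x, r in zip(x_col, r_diag))   # X^T R^{-1} X
    if info == 0.0:
        raise ValueError("X is zero, so the parameter is not estimable")
    return [x / r / info for x, r in zip(x_col, r_diag)], 1.0 / info

x_col = [1.0, 2.0, 0.5]
r_diag = [0.1, 0.4, 0.2]
C, sigma_sq = lumv_gain(x_col, r_diag)

cx = sum(c * x for c, x in zip(C, x_col))                  # C X, equals 1
crc = sum(c * c * r for c, r in zip(C, r_diag))            # C R C^T

# Residual form: with z = y - X*h_prev, C z equals C y - h_prev.
y2 = [0.9, 1.7, 0.5]
h_prev = 0.8
cz = sum(c * (y - x * h_prev) for c, y, x in zip(C, y2, x_col))
cy = sum(c * y for c, y in zip(C, y2))
```

The identity `cx == 1` is the unbiasedness of the estimate, `crc == sigma_sq` is the variance appearing in the error expression, and `cz == cy - h_prev` reflects $CX = I$ applied to the residual.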

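Let $\hat{\xi}^1$ be a linear estimate of $\xi^1 = h^2 - \hat{h}^1$ that is a
function of $Cz^2$, but is otherwise arbitrary. Thus,

$$\hat{\xi}^1 = \sum_{i,j=1}^{p} a_{ij}\,(Cz^2, \psi_j)\,\psi_i, \tag{15}$$

where the $\{a_{ij}\}$ are real numbers. The error in this estimate is given by

$$\hat{\xi}^1 - \xi^1 = \sum_{i=1}^{p} (a_{ii} - 1)(\xi^1, \psi_i)\psi_i
+ \sum_{\substack{i,j=1 \\ j \ne i}}^{p} a_{ij}(\xi^1, \psi_j)\psi_i
+ \sum_{i,j=1}^{p} a_{ij}(Cv^2, \psi_j)\psi_i. \tag{16}$$

Therefore, it follows from (10a) and the properties of the $\{\psi_i\}$ that

$$\begin{aligned}
E\|\hat{\xi}^1 - \xi^1\|^2
&= \sum_{i=1}^{p} E\Bigl[\Bigl((a_{ii} - 1)(\xi^1, \psi_i)
 + \sum_{j \ne i} a_{ij}(\xi^1, \psi_j)\Bigr)^2\Bigr]
 + \sum_{i,j=1}^{p} a_{ij}^2 \sigma_j^2 \\
&\le 2\sum_{i=1}^{p} (a_{ii} - 1)^2 E[(\xi^1, \psi_i)^2]
 + 2\sum_{i=1}^{p} E\Bigl[\Bigl(\sum_{j \ne i} a_{ij}(\xi^1, \psi_j)\Bigr)^2\Bigr]
 + \sum_{i,j=1}^{p} a_{ij}^2 \sigma_j^2.
\end{aligned} \tag{17}$$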

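To minimize the upper bound in (17), one should obviously choose $a_{ij} = 0$
for $i \ne j$. Making this choice yields

$$E\|\hat{\xi}^1 - \xi^1\|^2 = \sum_{i=1}^{p} (a_{ii} - 1)^2 E[(\xi^1, \psi_i)^2]
+ \sum_{i=1}^{p} a_{ii}^2 \sigma_i^2. \tag{18}$$

Now

$$\begin{aligned}
E[(\xi^1, \psi_i)^2]
&= E[(h^1 - \hat{h}^1, \psi_i)^2]
 + 2E[(h^1 - \hat{h}^1, \psi_i)(w^1, \psi_i)]
 + E[(w^1, \psi_i)^2] \\
&\le b_{1i}^2 + 2 b_{1i}\eta_i + \eta_i^2 = (b_{1i} + \eta_i)^2.
\end{aligned}$$

(Obviously $E[(h^1 - \hat{h}^1, \psi_i)(w^1, \psi_i)] = 0$, but in general
$E[(h^n - \hat{h}^n, \psi_i)(w^n, \psi_i)] \ne 0$, $n \ge 2$.) So

$$E\|\hat{\xi}^1 - \xi^1\|^2 \le \sum_{i=1}^{p}
\bigl[(a_{ii} - 1)^2 (b_{1i} + \eta_i)^2 + a_{ii}^2 \sigma_i^2\bigr]. \tag{19}$$

The upper bound in (19) is minimized by putting

$$a_{ii} = \frac{(b_{1i} + \eta_i)^2}{(b_{1i} + \eta_i)^2 + \sigma_i^2}, \quad i = 1, \ldots, p.$$

The linear estimate that results when these values are used is

$$\hat{\xi}^1 = \sum_{i=1}^{p} \frac{(b_{1i} + \eta_i)^2}{(b_{1i} + \eta_i)^2 + \sigma_i^2}\,(Cz^2, \psi_i)\,\psi_i. \tag{20}$$

Since $h^2 = \hat{h}^1 + \xi^1$, the estimate of $h^2$ is taken to be

$$\hat{h}^2 = \hat{h}^1 + \hat{\xi}^1. \tag{21}$$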
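The choice of the diagonal coefficients can be verified by elementary calculus. The following worked step minimizes the per-component bound $(a_{ii} - 1)^2 (b_{1i} + \eta_i)^2 + a_{ii}^2 \sigma_i^2$ over $a_{ii}$; the shorthand $A_i$ and $g_i$ is introduced here only for illustration and does not appear in the paper.

```latex
% Shorthand introduced for this sketch only: A_i and g_i are not in the source.
\begin{align*}
A_i &\triangleq (b_{1i} + \eta_i)^2, \qquad
g_i(a) \triangleq (a - 1)^2 A_i + a^2 \sigma_i^2, \\
g_i'(a) &= 2(a - 1) A_i + 2 a \sigma_i^2 = 0
  \;\Longrightarrow\; a = \frac{A_i}{A_i + \sigma_i^2}, \\
g_i''(a) &= 2 (A_i + \sigma_i^2) > 0
  \quad \text{(so the stationary point is a minimum)}, \\
g_i\!\left(\frac{A_i}{A_i + \sigma_i^2}\right)
  &= \frac{\sigma_i^4 A_i + A_i^2 \sigma_i^2}{(A_i + \sigma_i^2)^2}
   = \frac{A_i \sigma_i^2}{A_i + \sigma_i^2}.
\end{align*}
```

The minimum value is always smaller than both $A_i$ and $\sigma_i^2$, so the combined estimate is never worse, in the guaranteed sense, than either ignoring the new block or ignoring the previous estimate.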


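Then

$$\begin{aligned}
E\|\hat{h}^2 - h^2\|^2
&= E\|(\hat{h}^1 + \hat{\xi}^1) - (\hat{h}^1 + \xi^1)\|^2
 = E\|\hat{\xi}^1 - \xi^1\|^2 \\
&\le \sum_{i=1}^{p} \left\{
 \left[\frac{(b_{1i} + \eta_i)^2}{(b_{1i} + \eta_i)^2 + \sigma_i^2} - 1\right]^2 (b_{1i} + \eta_i)^2
 + \left[\frac{(b_{1i} + \eta_i)^2}{(b_{1i} + \eta_i)^2 + \sigma_i^2}\right]^2 \sigma_i^2 \right\} \\
&= \sum_{i=1}^{p} \frac{(b_{1i} + \eta_i)^2 \sigma_i^2}{(b_{1i} + \eta_i)^2 + \sigma_i^2}
 = \sum_{i=1}^{p} b_{2i}^2,
\end{aligned} \tag{22}$$

where

$$b_{2i}^2 \triangleq \frac{(b_{1i} + \eta_i)^2 \sigma_i^2}{(b_{1i} + \eta_i)^2 + \sigma_i^2}, \quad i = 1, \ldots, p.$$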


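If this process is continually repeated, we obtain the estimation algorithm

$$\begin{aligned}
\hat{h}^k &= \hat{h}^{k-1} + \hat{\xi}^{k-1}, \\
\hat{\xi}^{k-1} &= \sum_{i=1}^{p} \frac{(b_{(k-1)i} + \eta_i)^2}{(b_{(k-1)i} + \eta_i)^2 + \sigma_i^2}\,(Cy^k - \hat{h}^{k-1}, \psi_i)\,\psi_i, \\
b_{ki}^2 &= \frac{(b_{(k-1)i} + \eta_i)^2 \sigma_i^2}{(b_{(k-1)i} + \eta_i)^2 + \sigma_i^2}.
\end{aligned} \tag{23}$$

The initial conditions are given by

$$\hat{h}^1 = Cy^1, \quad b_{1i}^2 = \sigma_i^2, \quad i = 1, \ldots, p, \tag{24}$$

and the error in the estimate of $h^k$ satisfies

$$E\|\hat{h}^k - h^k\|^2 \le \sum_{i=1}^{p} b_{ki}^2. \tag{25}$$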
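The recursion can be sketched in a few lines for the simplest setting. The code below assumes $X = I$ and a diagonal $R$, so that $C = I$ and the eigenvectors are the standard unit vectors; this restriction, the function names `gain_and_bound` and `recursive_estimate`, and the sample data are assumptions of the sketch rather than part of the paper.

```python
import math

def gain_and_bound(b_prev_sq, eta, sigma_sq):
    """One step for component i: returns (gain a_ki, new bound b_ki^2)."""
    spread = (math.sqrt(b_prev_sq) + eta) ** 2     # (b_{(k-1)i} + eta_i)^2
    gain = spread / (spread + sigma_sq)
    return gain, gain * sigma_sq                   # spread*sigma^2/(spread+sigma^2)

def recursive_estimate(blocks, eta, sigma_sq):
    """Run the recursion on observation vectors y^1, y^2, ...

    Assumes X = I and R = diag(sigma_sq), so C = I and psi_i are unit
    vectors. Returns the estimates and the guaranteed bounds sum_i b_ki^2.
    """
    h_hat = list(blocks[0])          # initial estimate: h^1 = C y^1 = y^1
    b_sq = list(sigma_sq)            # initial bounds: b_1i^2 = sigma_i^2
    estimates, bounds = [h_hat[:]], [sum(b_sq)]
    for y in blocks[1:]:
        for i in range(len(h_hat)):
            gain, b_sq[i] = gain_and_bound(b_sq[i], eta[i], sigma_sq[i])
            h_hat[i] += gain * (y[i] - h_hat[i])   # weighted residual update
        estimates.append(h_hat[:])
        bounds.append(sum(b_sq))
    return estimates, bounds

blocks = [[1.0, 2.0], [1.1, 1.9], [1.2, 2.1]]
eta = [0.1, 0.2]
sigma_sq = [0.04, 0.09]
estimates, bounds = recursive_estimate(blocks, eta, sigma_sq)
```

Each gain lies strictly between 0 and 1: a large drift bound `eta` pushes the gain toward 1 (trust the new block), while a large noise variance pushes it toward 0 (trust the previous estimate).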


Lemma 1. If $f(x)$ is twice continuously differentiable for $x > 0$ and
satisfies

(i) $\lim_{x \to 0} f(x) > 0$,

(ii) $\lim_{x \to \infty} f(x) < \infty$,

(iii) $\dfrac{df(x)}{dx} > 0$, $x > 0$,

(iv) $\dfrac{d^2 f(x)}{dx^2} < 0$, $x > 0$,

then

(a) $f(x) - x$ has [...]

(b) the sequence [...]

Proof: Follows from elementary arguments.

The limiting upper bound can now be established.


Proposition 1. If $x_i^*$ is the solution of the equation

$$\frac{(x^{1/2} + \eta_i)^2 \sigma_i^2}{(x^{1/2} + \eta_i)^2 + \sigma_i^2} = x,$$

then $\lim_{n \to \infty} b_{ni}^2 = x_i^*$.

Proof: Define

$$\phi_i(x) \triangleq \frac{(x^{1/2} + \eta_i)^2 \sigma_i^2}{(x^{1/2} + \eta_i)^2 + \sigma_i^2}$$

and note that $\phi_i(b_{(n-1)i}^2) = b_{ni}^2$. Result follows directly from
application of Lemma 1. ∎

Hence, use of the estimation algorithm in (23) yields estimates
$\{\hat{h}^k\}$ satisfying
$\limsup_{k \to \infty} E\|\hat{h}^k - h^k\|^2 \le \sum_{i=1}^{p} x_i^*$.
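The limiting upper bound for one component can be approximated by simply iterating the bound recursion from its initial value. The sketch below assumes a strictly positive drift bound (for a zero drift bound the iteration converges too slowly for the stopping rule used here); `phi`, `limiting_bound`, the tolerance, and the numbers are hypothetical choices.

```python
import math

def phi(x, eta, sigma_sq):
    """Map taking b_{(n-1)i}^2 to b_{ni}^2 for one component."""
    s = (math.sqrt(x) + eta) ** 2
    return s * sigma_sq / (s + sigma_sq)

def limiting_bound(eta, sigma_sq, tol=1e-12, max_iter=10000):
    """Iterate x <- phi(x) from x = sigma_sq until successive values agree.

    Intended for eta > 0; returns an approximation of the fixed point.
    """
    x = sigma_sq                     # starting value b_1i^2 = sigma_i^2
    for _ in range(max_iter):
        x_next = phi(x, eta, sigma_sq)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")

x_star = limiting_bound(eta=0.1, sigma_sq=0.04)
x_star_faster_drift = limiting_bound(eta=0.2, sigma_sq=0.04)
```

The returned value satisfies the fixed-point equation to within the tolerance, stays below the single-block variance `sigma_sq`, and grows when the drift bound is increased.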


LOWER  BOUNDS  ON  THE  ESTIMATION  ERROR 

We now derive a lower bound on the attainable mean-squared error, in the
sense that with any linear estimator the error can always be as large as this
bound if certain parameter changes occur. Of course, in many cases the actual
error may be less than this lower bound, but we can never "guarantee" this
fact. The basic idea is to construct a problem in which the system variations
are random, with known mean and covariance, but bounded as in (10b). Kalman
filtering techniques can then be applied in order to find the minimum
mean-squared error which is attainable using any linear estimation procedure.
This error provides the lower bound of interest.

Suppose that the $w^k$'s are random vectors, independent of the $v^k$'s,
which satisfy (10b) and are described by

$$P[(w^k, \psi_i) = \eta_i] = \tfrac{1}{2} = P[(w^k, \psi_i) = -\eta_i], \tag{26}$$

and

$$E[(w^k, \psi_i)(w^l, \psi_j)] = \eta_i^2\,\delta_{ij}\,\delta_{kl}. \tag{27}$$

Then

$$Ew^k = 0, \quad Ew^k (w^j)^T = Q\,\delta_{kj}, \tag{28}$$

where $Q$ can be found from knowledge of the $\{\psi_i\}$ and (26). Suppose
also that $h^1$ is a random vector with, say, mean zero and covariance matrix
$P_0$, and let $P_1 = (X^T R^{-1} X)^{-1}$. The error covariance matrix $P_k$
for the Kalman estimate $\hat{h}^k$ of $h^k$ is given by the recursive
equation (see, e.g., [4])

$$P_k = (P_{k-1} + Q)\bigl(I - X^T [X (P_{k-1} + Q) X^T + R]^{-1} X (P_{k-1} + Q)\bigr), \quad k = 2, 3, \ldots. \tag{29}$$
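The covariance of the two-point drift model can be confirmed by direct enumeration. The sketch assumes that the eigenvectors are the standard unit vectors and that the signs of the components are chosen independently (independence is stronger than the uncorrelatedness the text requires and is assumed here only to make enumeration possible); `sign_model_covariance` is a hypothetical helper.

```python
from itertools import product

def sign_model_covariance(eta):
    """Mean and covariance of w when each component is +eta_i or -eta_i
    with probability 1/2, independently across components."""
    p = len(eta)
    patterns = list(product((-1.0, 1.0), repeat=p))   # all 2^p sign choices
    weight = 1.0 / len(patterns)
    mean = [0.0] * p
    cov = [[0.0] * p for _ in range(p)]
    for signs in patterns:
        w = [s * e for s, e in zip(signs, eta)]
        for i in range(p):
            mean[i] += weight * w[i]
            for j in range(p):
                cov[i][j] += weight * w[i] * w[j]
    return mean, cov

eta = [0.1, 0.2, 0.5]
mean, Q = sign_model_covariance(eta)
```

Every realization sits exactly on the boundary of the admissible drift set, the mean is zero, and `Q` comes out diagonal with entries `eta[i] ** 2`.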


We  then  have: 


Proposition 2. The upper bound on the attainable mean-squared error of any
linear estimate of the system parameters $\{h^n\}$, in the model described by
(9) and (10), cannot be less than $\operatorname{Tr} P_k \triangleq \epsilon$,
where $P_k$ ($k \ge 2$) is given by (29) and $P_1 = (X^T R^{-1} X)^{-1}$.

Proof: Suppose the $\{w^k\}$ are random vectors satisfying (26) and $h^1$ is a
random vector with mean zero and covariance $P_0$. First we observe that if
only $y^1$ is available, there is no linear estimator that will yield
uniformly smaller mean-squared error than $\operatorname{Tr} P_1$. This is
because $P_1$ was chosen to be the variance of the LUMV estimate for $h^1$ in
the equation $y^1 = Xh^1 + v^1$, and no biased estimate gives a mean-squared
error bounded for all $h^1$. Next, suppose that for $m \ge 2$ there exists a
linear estimator $\tilde{h}^m$ such that

$$E(\|\tilde{h}^m - h^m\|^2 \mid w^1, \ldots, w^{m-1}) < \epsilon$$

for every sequence $\{w^1, \ldots, w^{m-1}\}$. Then

$$E\|\tilde{h}^m - h^m\|^2 = E\{E\{\|\tilde{h}^m - h^m\|^2 \mid w^1, \ldots, w^{m-1}\}\} < \epsilon.$$

But this contradicts the minimum mean-squared error property of the Kalman
estimate $\hat{h}^m$. Consequently, for any $m$, there is a sequence
$w^1, \ldots, w^{m-1}$ such that
$E(\|\tilde{h}^m - h^m\|^2 \mid w^1, \ldots, w^{m-1}) \ge \epsilon$. Since
$\{w^1, \ldots, w^{m-1}\}$ satisfies (10b), this proves the assertion. ∎

It is of interest to interpret this result in the special case of (9) and
(10) where $X$ is the identity and $R$ is diagonal. The eigenvectors
$\{\psi_i\}$ then become the standard unit vectors in $\mathbb{R}^p$ and
$Q = \operatorname{diag}(\eta_1^2, \eta_2^2, \ldots, \eta_p^2)$. Also, with a
little algebraic manipulation (29) becomes

$$P_k = (P_{k-1} + Q)(P_{k-1} + Q + R)^{-1} R.$$

Since $P_1 = R$ is diagonal, all the $P_k$ will be diagonal. In particular, if
$P_{k-1} = \operatorname{diag}(\alpha_{(k-1)1}^2, \alpha_{(k-1)2}^2, \ldots, \alpha_{(k-1)p}^2)$
then $P_k = \operatorname{diag}(\alpha_{k1}^2, \alpha_{k2}^2, \ldots, \alpha_{kp}^2)$,
where

$$\alpha_{ki}^2 = \frac{(\alpha_{(k-1)i}^2 + \eta_i^2)\,\sigma_i^2}{\alpha_{(k-1)i}^2 + \eta_i^2 + \sigma_i^2}, \quad i = 1, \ldots, p. \tag{30}$$
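The scalar recursions for the lower and upper bounds can be compared numerically. In the sketch below, the closed-form limit in `lower_limit` is obtained here by solving the fixed-point equation of the lower-bound recursion as a quadratic; it is a derivation made for this example, not a formula quoted from the paper. All function names and the numbers are hypothetical.

```python
import math

def kalman_step_scalar(p_prev, q, r):
    """Error-variance update with X = 1: P = M - M*(M + r)^{-1}*M, M = p_prev + q."""
    m = p_prev + q
    return m * (1.0 - m / (m + r))

def lower_step(alpha_sq, eta, sigma_sq):
    """Lower-bound recursion for one diagonal entry."""
    return (alpha_sq + eta ** 2) * sigma_sq / (alpha_sq + eta ** 2 + sigma_sq)

def lower_limit(eta, sigma_sq):
    """Positive root of a^2 + eta^2*a - eta^2*sigma^2 = 0 (derived here)."""
    e2 = eta ** 2
    return 0.5 * (-e2 + math.sqrt(e2 * e2 + 4.0 * e2 * sigma_sq))

def upper_step(b_sq, eta, sigma_sq):
    """Upper-bound recursion for one component."""
    s = (math.sqrt(b_sq) + eta) ** 2
    return s * sigma_sq / (s + sigma_sq)

eta, sigma_sq = 0.1, 0.04
alpha_sq = b_sq = sigma_sq           # both recursions start from sigma_i^2
for _ in range(200):
    alpha_sq = lower_step(alpha_sq, eta, sigma_sq)
    b_sq = upper_step(b_sq, eta, sigma_sq)
```

After the loop `alpha_sq` agrees with `lower_limit(eta, sigma_sq)` and does not exceed `b_sq`. The gap between the two comes from replacing $\alpha^2 + \eta^2$ in the lower recursion with $(b + \eta)^2$ in the upper one.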


Corollary. If for the system described by (9) and (10), $X$ is the identity
and $R$ is diagonal, then there exists a sequence $\{w^k\}$ satisfying (10b)
such that [...] for any linear estimate of $h^k$.

Proof: Define [...] and note that for equation (30),
$\theta(\alpha_{(k-1)i}^2) = \alpha_{ki}^2$. Application of Lemma 1 yields
[...]; the result follows directly from Proposition 2. ∎

Figure 1 provides a plot of the $i$-th component of the upper and lower bounds
on the mean-square error when $X = I$ and $R$ is diagonal.

Figure 1. Bounds on Errors


COMMENTS 

(1) The results derived for estimating the parameters when they vary only
between blocks of measurements can be extended to include the case where
the parameters vary between each pair of measurements (see [3]).

(2) The upper and lower bound expressions given by Propositions 1 and 2
provide criteria for determining whether or not a system can be considered
as being slowly varying (see [3]).

(3) The restriction to the case where X is the identity is in fact less
restrictive than it would appear to be at first glance. The "standard
ε-representation" developed in [1] does provide such a model for a very
large class of systems. Of course, the parametrization provided by this
model may not be what is desired in a practical situation. However, our
point of view as regards the concept of slowly varying is that if there is
any parametrization for which sufficiently good estimates can be made,
then the system is slowly varying.

(4) The estimate used at each stage of the recursive estimation procedure
is essentially the modified LUMV estimate described in [2].